Abstract

Triploid F1 hybrids generated via reciprocal interploidy crosses between genetically distinct parental plants 
can display parent-of-origin effects on gene expression or phenotypes. Reciprocal triploid F1 isogenic plants 
generated from interploidy crosses in the same genetic background allow investigation of parent-of-origin-
specific (parental) genome-dosage effects without confounding effects of hybridity involving heterozygous
mutations. Whole-genome transcriptome profiling was conducted on reciprocal F1 isogenic triploid (3x) seedlings
of A. thaliana. The genetically identical reciprocal 3x genotypes had an excess of either maternally inherited,
3x(m), or paternally inherited, 3x(p), genomes. We identify a major parent-of-origin-dependent genome-dosage
effect on transcript levels, whereby 602 genes exhibit differential expression between the reciprocal
F1 triploids. In addition, using methylation-sensitive DNA tiling arrays, constitutive and polymorphic CG 
DNA methylation patterns at CCGG sites were analysed, which revealed that paternal-excess F1 triploid seedling 
C m CGG sites are overall hypermethylated. However, no correlation exists between C m CGG methylation poly- 
morphisms and transcriptome dysregulation between the isogenic reciprocal F1 triploids. Overall, our study 
indicates that parental genome-dosage effects on the transcriptome levels occur in paternal-excess triploids, 
which are independent of C m CGG methylation polymorphisms. Such findings have implications for under- 
standing parental effects and genome-dosage effects on gene expression and phenotypes in polyploid plants. 
Keywords: triploid; DNA methylation; parent-of-origin effect; polyploidy; epigenetic 



1. Introduction

Changes in gene dosage at the whole-genome, 
chromosomal or segmental levels can elicit phenotypic
and gene expression effects associated with dosage-
sensitive genes. 1 Polyploidization events increase the
dosage of all loci, including structural and regulatory 
loci, controlling traits that may be genome dosage 
sensitive. Due to the importance of polyploidy to 
plant evolution and crop breeding, many aspects of 
gene regulation in polyploids require elucidation, 2,3 
including how dosage effects and other epigenetic 
changes are triggered and maintained in both allo- 
and autopolyploids. 4,5 

A range of studies in allopolyploid plant genomes 
have revealed rapid epigenetic changes, including alteration
in cytosine methylation patterns, rapid silencing
of ribosomal RNA and protein-coding genes, and
de-repression of dormant transposable elements. 6,7
However, allopolyploid genomes are genetic hybrids, 
where both ploidy (genome dosage) and hybridity 
(mutational differences) occur in concert, making it 
difficult to disentangle genetic effects from genome 
(and gene)-dosage effects. 

In contrast, the generation of autopolyploids in the 
same genetic background provides a model system for 
the analysis of genome-dosage effects (sensu stricto) in
the absence of mutational differences between lines 
of different ploidy. In autopolyploids, there have been 
a range of studies on genome-dosage effects on phenotypes
in maize 8,9 and A. thaliana. 10-12 Changes in
genome dosage in autopolyploids, and also in individual
gene dosage, have been shown to modify epigenetic
silencing in plants. 9,13 The majority of studies on gene
expression changes in autopolyploids conducted to
date have either focused on a limited number of
genes 12,14,15 or found few gene (transcript) expression
level changes at the whole-genome level between
diploids and tetraploids of A. thaliana. 16,17

In addition, little is known regarding the extent of 
DNA methylation changes associated with autopoly- 
ploidy in A. thaliana or the functional effects of DNA 
methylation polymorphisms on gene expression or 
phenotypic changes. In diploids, it has been previously 
shown in A. thaliana methyltransferase mutants 
(drm1, drm2, cmt3 and met1) that loss of methylation
can lead to upregulation of genes, 18-20 and conversely
that increased methylation can lead to downregulation
in A. thaliana hybrids. 20

In interploidy 2x x 4x crosses, genome-dosage 
effects can occur in a parent-of-origin-dependent or 
-independent manner, depending on whether the 
two different types of reciprocal F1 triploids (i.e. 
2m:1p versus 1m:2p) display different phenotypes. 
While parent-of-origin-dependent genome-dosage 
effects have been observed on phenotypes in maize and
A. thaliana, little is known regarding parent-of-origin- 
dependent genome-dosage effects on gene expression 
in reciprocal F1 triploids. The generation of reciprocal 
F1 triploids in the same genetic background provides a 
system for the identification and analysis of genome- 
dosage and other epigenetic parental effects on 
phenotypes and gene expression, which are not due 
to mutational differences between the reciprocal F1 
triploids. 

In this study, we have used isogenic reciprocal F1 
triploids to demonstrate a major parent-of-origin- 
dependent (i.e. parental) genome-dosage-dependent 
effect on transcript levels in paternal genome excess 
F1 triploids. We also demonstrate that this novel paren- 
tal genome-dosage effect in the F1 triploids is C m CGG 
methylation-independent at the whole-genome level. 
This suggests that the paternally and maternally inherited
chromosome sets in autopolyploid plants may
be epigenetically different, due to parental genome-
dosage effects that can affect transcript levels in a 
C m CGG methylation-independent manner. 

2. Materials and methods 

2.1. Plant materials 

Col-0 diploid (2x) and tetraploid (4x) seeds, selfed for 
at least two generations after colchicine treatment, were 
the kind gift of Luca Comai (University of Washington). 
Maternal-excess triploids (3x(m)) and paternal-excess 
triploids (3x(p)) were generated by manually crossing 
emasculated diploid or tetraploid flowers with diploid 
or tetraploid pollen under a Leica MZ6 dissection microscope
using Dumostar No. 5 tweezers. Sterilized seeds
were sown on 0.5X MS (Murashige and Skoog) media
and grown in a Percival growth chamber (16 h light
and 8 h darkness). Seedlings were harvested at the two
true leaf stage (Boyes standard 1.02 21) for subsequent
analysis. The ploidy of the resulting crosses was verified 
by flow cytometry using a Partec Ploidy Analyzer, with 
CyStain UV Precise P (Partec) reagents, following manu- 
facturer's instructions. 

2.2. Sample preparation and microarray hybridization

RNA from four biological replicates per ploidy level,
with 20 seedlings per replicate, was extracted using
the QIAGEN RNeasy Plant Mini Kit (#74903). mRNA
was purified with QIAGEN Oligotex mRNA Mini Kits
(#70022) using 25 µg of initial RNA, and used for
double-stranded cDNA synthesis (Superscript Double-
Stranded cDNA synthesis #11917-010). After RNAse
treatment (Epicentre RNAseH) and dscDNA purification
(QIAGEN Qiaquick PCR purification kit), samples
were labelled using Invitrogen BioPrime DNA labelling,
and processed. For methylation analysis, DNA was
extracted using the QIAGEN DNeasy Plant Mini Kit
(#69104), and 300 ng of DNA was digested with 20 units
of MseI and 10 units of either HpaII or MspI (New
England Biolabs) at 37°C for 16 h. After ethanol
precipitation of digested DNA, the samples were
labelled with Invitrogen BioPrime DNA labelling. All
kits were used according to manufacturer's instructions.
Transcriptome and methylome analyses were
performed using a custom whole-genome, SNP-tiling
array (AtSNPtile1). 22

2.3. Tiling array and small RNA data analysis 

The tiling array used was the AtSNPtile1 tiling array,
which contains 1.4 million unique probes tiled
along both strands of the entire A. thaliana genome
at 35 bp resolution. The tiling probes include all
unique features with good hybridization quality on
the A. thaliana tiling array 1.0 (Affymetrix). The analysis
of the tiling array data for the detection of indels,
gene expression and DNA methylation differences,
including the validation of gene expression differences
(by qRT-PCR) and the correlation between DNA methylation,
small RNA and gene expression, is described in
Supplementary Data.



3. Results and discussion

3.1. Generation of isogenic F1 triploid plants

Reciprocal crosses of tetraploid (4x) and diploid (2x)
parental lines in the same accession background can
generate viable reciprocal F1 triploids in A. thaliana
that are genetically identical, i.e. they are isogenic
(Fig. 1). Such reciprocal F1 triploids provide an ideal
model system to investigate parent-of-origin-specific
(parental) effects on gene expression and other phenotypes. 23
To identify any parent-of-origin-specific
genome-dosage effects on gene expression between
isogenic reciprocal F1 triploids of A. thaliana (accession
Col-0), microarray profiling was performed on
A. thaliana seedlings. The reciprocal F1 triploids were
generated from neo-tetraploid (F2) plants (in the
Col-0 accession background) that were reciprocally
crossed in both parental directions (as either pollen or
ovule donor) to the diploid progenitor line. Paternal-
excess F1 triploids (3x(p)), each containing two copies
of the paternal genome and one copy of the maternal
genome, were generated by fertilizing emasculated
diploid flowers with pollen from the tetraploid (i.e. a
2x x 4x cross). In contrast, emasculated tetraploid
flowers were fertilized with pollen from the diploids
(i.e. a 4x x 2x cross) to generate the reciprocal maternal-
excess F1 triploids (3x(m)). The comparison of these
genetically identical 3x(p) and 3x(m) plants formed
the basis of our analysis of parent-of-origin genome-
dosage effects on the transcriptome.

Figure 1. Generation of isogenic F1 reciprocal triploid plants. (A)
Generation of tetraploid A. thaliana Col-0 using colchicine
treatment. (B and C) Generation of maternal- and paternal-
excess reciprocal F1 triploids by crossing diploid and tetraploid
A. thaliana plants.



3.2. High-density tiling array analysis confirms that 
isogenic reciprocal F1 triploids generated from 
interploidy crosses in A. thaliana are genetically 
identical 

In a selfing species, such as A. thaliana, the crossing of 
diploid and tetraploid plants that are in the same 
genetic background should result in the generation of 
F1 triploid plants that contain the same DNA sequence 
(i.e. are isogenic in terms of DNA sequence) as the par- 
ental lines. Such isogenic ploidy series systems allow us 
to test for strict genome-dosage effects on gene expres- 
sion and other phenotypes. In addition, the generation 
of isogenic F1 triploids provides a model system for 
detecting parent-of-origin-specific genome-dosage 
effects. To confirm that reciprocal F1 triploids gener- 
ated from interploidy crosses are indeed isogenic, we 
hybridized genomic DNA to the tiling arrays and 
screened for any evidence of single feature polymorph- 
isms (SFPs) 24 by comparing individual probe intensities 
across the reciprocal F1 triploids. No SFPs were detected
in the isogenic F1 triploids. 

As this approach may not have detected longer indels 
spanning several tiling array probes, we also compared 
the two triploid datasets using a segmentation algorithm. 25
Only one potential indel (Chr2: 13 390 565-13 391 067)
was predicted using this approach. When
we subsequently tested this by PCR using primers 
designed to span the location of the putative indel, no 
such indel could be detected (Supplementary Fig. S1), 
indicating that this was a false positive from the segmen- 
tation algorithm. Overall, this analysis confirmed that 
the reciprocal F1 triploid plants are genetically isogenic 
and hence, provide a robust platform for the investiga- 
tion of parent-of-origin-specific genome-dosage effects 
in plants. 

3.3. Isogenic reciprocal F1 triploid plants display 
epigenetic parent-of-origin-specific genome-dosage 
effects on transcript levels 

Having confirmed that the reciprocal F1 triploids 
generated from reciprocal interploidy crosses were 
truly isogenic, the tiling arrays were used to interrogate 
the expression levels of 25 703 genes in seedlings. 
Despite the fact that the reciprocal F1 triploids are gen- 
etically identical at the DNA sequence level, 602 genes 
were found to be differentially expressed between the 
reciprocal F1 triploids [using a false discovery rate
(FDR) of 4.74 x 10^-3]. All 602 differentially
expressed genes had fold changes
greater than 2.5 (Supplementary Table S1), and all
were upregulated in the paternal-excess
3x(p) F1 triploid compared with the maternal-excess
3x(m) F1 triploid. The upregulation of gene
expression levels in 3x(p) versus 3x(m) F1 triploids
was validated by qRT-PCR for 15 of 20 genes tested
(75%) (Supplementary Fig. S2). 

Overall, our results using isogenic F1 triploids provide 
the first evidence of widespread parent-of-origin-specific
genome-dosage effects on gene expression in triploid
vegetative plant tissues. While parent-of-origin-
specific expression (e.g. due to genomic imprinting 26)
has been detected in triploid endosperm tissues, 27-29 
previous studies have found no evidence for parent- 
of-origin effects on gene expression in diploid vegeta- 
tive tissues. 22 



3.4. The 602 genes subject to parent-of-origin-specific
genome-dosage effects are enriched for stress- 
response genes 

To investigate the biological processes associated 
with the altered transcriptome of the 3x(p) triploid F1 
seedlings, we screened for Gene Ontology (GO) terms 
enriched among the 602 upregulated genes by condi- 
tional hypergeometric tests. 30 In total, 872 biological 
processes (BPs), 546 molecular functions and 230 
cellular components (CC) were tested for this analysis. 
The BP analysis discovered a significant parent-of-origin- 
specific genome-dosage effect on stress-response genes, 
with several stress-response terms significantly over- 
represented in the genes upregulated in the 3x(p) F1 
triploid, including both biotic and abiotic stress 
responses (Supplementary Table S2). 
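
The enrichment logic described above can be illustrated with a minimal sketch. It implements a plain one-sided hypergeometric test, not the conditional variant used in the study (which additionally accounts for the GO hierarchy); the function name and the example counts are hypothetical and are not taken from the Supplementary Tables.

```python
from math import comb


def hypergeom_enrichment_p(universe: int, annotated: int,
                           selected: int, overlap: int) -> float:
    """Upper-tail hypergeometric probability P(X >= overlap).

    universe  : number of genes tested on the array
    annotated : genes in the universe carrying a given GO term
    selected  : size of the gene list of interest (e.g. upregulated genes)
    overlap   : genes in the list that carry the GO term
    """
    max_k = min(annotated, selected)
    total = comb(universe, selected)
    tail = sum(
        comb(annotated, k) * comb(universe - annotated, selected - k)
        for k in range(overlap, max_k + 1)
    )
    # Python integers are exact, so the ratio is only rounded once here.
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts: 25 703 genes, a term annotating 500 of them,
    # and 30 of the 602 upregulated genes carrying that term.
    print(hypergeom_enrichment_p(25_703, 500, 602, 30))
```

The same calculation could, under the same simplifying assumption, be applied to the overlap between the 602 genes and any other gene list drawn from the same universe.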

To further investigate the enrichment for both abiotic
and biotic stress-response genes in the 3x(p) triploid, 
the AtGenExpress abiotic stress, pathogen infection, 
growth conditions, hormone and chemical treatment 
datasets were also interrogated. For this analysis, a list 
of all differentially expressed genes was identified for 
each treatment-tissue-time-point combination in 
the AtGenExpress datasets, and these AtGenExpress 
lists were then compared with the 602 upregulated
genes in the 3x(p) triploid to identify significant overlaps.
Using this approach, for each abiotic stress tested
(cold, salt, heat, osmotic, genotoxic, wounding, oxidative,
drought and UV), significant overlaps were detected
with the 602 genes identified in the 3x(p) triploid
(Supplementary Table S3). In addition, significant overlaps
were also detected between the 602 3x(p) genes
and biotic stress-response gene sets. These included
genes responsive to several pathogens, including
Pseudomonas, potato blight, virulent, avirulent, Type III
secretion system-deficient and non-host bacterial
pathogens, bacterial derived elicitors [LPS, HrpZ, Flg22
and oomycete (NPP1)] and mildew infection
(Supplementary Table S3). While genome-dosage
effects on transcript levels have been observed
between diploid and tetraploid A. thaliana, 16,17,31,32
and tetraploid lines have been observed to have increased salinity
tolerance, 33 to our knowledge, this is the first time that
a parent-of-origin-specific genome-dosage effect on
transcript levels of biotic and abiotic stress-response
genes has been demonstrated.

3.5. Paternal-excess isogenic F1 triploids display
elevated m CG methylation at CCGG sites

It is commonly considered that genome-wide tran- 
script expression is correlated with gene cytosine 
methylation, 18,34,35 which, in plants, most commonly
occurs in the CG context. 36 To determine whether the 
genome-wide distribution of CG DNA methylation 
differed between the reciprocal isogenic F1 triploids, 
tiling array analysis of CG methylation in a CCGG 
context was performed. This analysis was performed 
using the same tiling array platform that was used for 
the transcriptome analysis, allowing for direct locus- 
by-locus comparisons to be made. 

Briefly, DNA was extracted from matched samples (i.e.
the four biological replicates used in the gene expression
study) and the DNA digested with either of the
DNA methylation-sensitive restriction enzymes MspI
or HpaII, both of which recognize CCGG restriction
enzyme sites. MspI is insensitive to methylation at the
internal cytosine (i.e. C m CGG) and will cut the CCGG
site whether the internal C is methylated or
not. In contrast, HpaII will only cut CCGG when the
internal cytosine is unmethylated (i.e. CCGG).
Therefore, increased signal intensity at probes containing
CCGG sequences in the HpaII-digested samples
indicates methylation at the internal C site (i.e.
C m CGG).
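
The cutting rules stated above can be written down as a small sketch. The first function encodes only what the text says about MspI and HpaII at the internal cytosine; the second is a deliberately simplified, hypothetical stand-in for the statistical model used in the study (a fixed intensity threshold, `min_diff`, which is an assumption and not a value from the paper).

```python
def is_cut(enzyme: str, internal_c_methylated: bool) -> bool:
    """Return True if a CCGG site is cleaved, following the rules in the text."""
    if enzyme == "MspI":
        # MspI cuts whether or not the internal cytosine is methylated.
        return True
    if enzyme == "HpaII":
        # HpaII is blocked by internal-cytosine methylation (C m CGG).
        return not internal_c_methylated
    raise ValueError(f"unsupported enzyme: {enzyme}")


def call_internal_methylation(hpaii_signal: float, mspi_signal: float,
                              min_diff: float) -> bool:
    """Toy call: higher probe signal after HpaII than after MspI digestion
    suggests that the site was protected from HpaII, i.e. methylated.

    min_diff is a hypothetical threshold on the (log-scale) difference.
    """
    return (hpaii_signal - mspi_signal) > min_diff


if __name__ == "__main__":
    for methylated in (False, True):
        print(methylated, is_cut("MspI", methylated), is_cut("HpaII", methylated))
```

In practice, replicate variation makes a fixed threshold unreliable, which is presumably why a model-based test is used instead.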

To scan the genome for CG methylation polymorphisms
between the reciprocal F1 triploids, a total of 75
943 CCGG sites interrogated on the array were each
tested for MspI versus HpaII intensity differences, and
a linear, mixed-effects model was implemented to
detect both constitutive and polymorphic methylation
(see Materials and Methods, and ref. 37). In this experiment,
constitutive methylated CCGG sites are those in
which internal cytosine methylation is observed in
both of the isogenic reciprocal F1 triploids. In contrast,
polymorphic methylated CCGG sites are those in
which internal cytosine methylation is observed in
only one of the isogenic reciprocal F1 triploids. Using
this approach, 8008 CCGG sites were identified as
methylated at the internal cytosine (i.e. C m CGG) in
both of the reciprocal F1 triploids (FDR = 5.64 x
10^-3). In contrast, a further 5644 sites (7.4% of the
total 75 943 sites scanned) displayed polymorphic
internal cytosine methylation, indicating that these
internal cytosine sites were differentially methylated
between the genetically identical F1 triploids. Of
these, 3587 (63.6%) were methylated in the 3x(p)
but not the 3x(m) (FDR = 3.54 x 10^-3), whereas
the remaining 2057 (36.4%) showed the opposite
pattern and were methylated in the 3x(m) (FDR =
5.77 x 10^-3) but not the 3x(p). Of the 13 652
C m CGG sites detected across the genome, 41.3% were
polymorphic between the reciprocal F1 triploids.
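
As a consistency check, the percentages quoted above follow directly from the reported site counts. The short sketch below recomputes them; only the four counts are taken from the text, and the variable names are illustrative.

```python
# Site counts as reported in the text.
total_sites = 75_943      # CCGG sites interrogated on the array
constitutive = 8_008      # methylated in both reciprocal triploids
poly_3xp_only = 3_587     # methylated in 3x(p) but not 3x(m)
poly_3xm_only = 2_057     # methylated in 3x(m) but not 3x(p)

polymorphic = poly_3xp_only + poly_3xm_only
methylated = constitutive + polymorphic


def pct(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


summary = {
    "polymorphic sites": polymorphic,
    "all C m CGG sites": methylated,
    "polymorphic, % of scanned": pct(polymorphic, total_sites),
    "3x(p)-only, % of polymorphic": pct(poly_3xp_only, polymorphic),
    "3x(m)-only, % of polymorphic": pct(poly_3xm_only, polymorphic),
    "polymorphic, % of methylated": pct(polymorphic, methylated),
}

if __name__ == "__main__":
    for label, value in summary.items():
        print(f"{label}: {value}")
```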

The extensive differential C m CGG methylation 
between the isogenic reciprocal F1 triploids can be 
due to parental genome-dosage effects on m CG estab- 
lishment or maintenance. Indeed, it is likely that there 
are parental genome-dosage effects on both of these 
DNA methylation mechanisms, as the 3587 could be 
neomethylated in the paternal-excess 3x, whereas the 
2057 could be demethylated in the paternal excess 
(or vice versa). Our results suggested that RNA-directed 
DNA methylation (RdDM; via AGO4 and DRM2) at
some loci could be subject to parent-of-origin-specific
genome-dosage effects on CG methylation establishment. 38
However, the transcript levels of AGO4 and
DRM2 do not differ between the reciprocal F1 triploids,
indicating that the parent-of-origin-specific genome-
dosage effects on m CG methylation polymorphism are
not due to AGO4/DRM2 transcript levels affecting
m CG methylation establishment. 

Our results could also indicate that maintenance of 
m CG methylation at some sites via MET1, DDM1 and/or
VIM1-3 can be subject to parental genome-dosage
effects. 38 However, the transcript levels of MET1,
DDM1 and/or VIM1-3 do not differ between the reciprocal
F1 triploids, indicating that the parent-of-origin-
specific genome-dosage effects on m CG methylation
polymorphism are not due to different transcript
levels of MET1, DDM1 and/or VIM1-3 affecting m CG
methylation maintenance. Alternately, the m CG 
methylation polymorphism observed could be due to 
parental genome-dosage effects on establishment or 
maintenance pathways that are operating at a post- 
transcriptional level. It is important to consider that 
the assay used does not interrogate all the CG sites in 
the genome, i.e. the assay is restricted to those CG 
sites in a CCGG context. Any conclusions made from 
our results hence relate to CCGG sites, and not to all 
CG sites in the genome. However, it is possible that 
the CCGG data subset may be representative of overall 
CG methylation (see below). 

3.6. Parental genome-dosage effect polymorphic
C m CGG methylation is uniformly distributed across
chromosomal regions in reciprocal F1 triploids

To determine whether the polymorphic m CG methylation
at CCGG sites was clustered on any chromosomes
or chromosomal regions, the genome-wide distribution
of constitutive C m CGG methylation was analysed
across the five A. thaliana chromosomes. This deter- 
mined that the constitutively methylated C m CGG sites 
are concentrated in pericentromeric regions, while 
depleted in the middle of the chromosome arms 
(Fig. 2A). This pattern for constitutive C m CGG methyla- 
tion is similar to the distribution previously reported in 
diploids, 34,37 including assays evaluating m C in all con- 
texts. 33 In contrast, the polymorphic C m CGG methyla- 
tion sites that are subject to parental genome-dosage 
effects (i.e. those which differ between 3x(m) and 
3x(p) triploids) are much more uniformly distributed 
across the chromosomes (Fig. 2A). 

As DNA methylation changes could have either 
site-specific or broader effects on local chromatin structure, 39
we further investigated whether the DNA methylation
polymorphisms were clustered across large tracts
of sequence rather than at individual sites (as our 
method identifies). Hence, LOWESS smoothing was per- 
formed on the d scores using 200 kb discrete windows 
for both constitutive and polymorphic C m CGG methy- 
lated sites. LOWESS smoothing performs locally 
weighted regression on neighbouring d scores within 
the window such that each d score is adjusted to 
reflect the overall pattern displayed by its neighbours. 
This allows identification of regional methylation pat- 
terns along chromosomes when compared with a null 
distribution. After applying LOWESS smoothing, consti- 
tutive C m CGG methylation continued to be preferential- 
ly located around the pericentromeres, indicating that 
constitutively methylated sites are clustered in these 
regions (Fig. 2B). In contrast, the LOWESS-smoothed d 
scores for polymorphic C m CGG methylation sites are 
largely indistinguishable from those of the null distribu- 
tion (Fig. 2C). This indicates that, in chromosome 
regions distal from the centromeres, polymorphic and 
constitutive C m CGG methylation sites do not show any 
patterns specific to large chromosomal tracts (i.e. at 
200 kb windows or multiples thereof), and that 
the CCGG sites, which are differentially methylated 
between F1 triploids, are therefore likely to be discretely 
located at a resolution of <200 kb.
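
The smoothing-versus-null comparison described in this section can be sketched as follows. This is a minimal illustration and not the analysis code used in the study: the smoother is a single-pass local linear regression with tricube weights whose span is set by a neighbour fraction (`frac`) rather than by 200 kb windows, and the block-shuffling scheme is one plausible reading of "shuffling by 1 kb blocks". All function and parameter names are hypothetical.

```python
import random


def lowess(xs, ys, frac=0.3):
    """Locally weighted linear regression (one pass, tricube weights)."""
    n = len(xs)
    k = max(2, int(round(frac * n)))          # neighbours per local fit
    fitted = []
    for x0 in xs:
        dists = sorted(abs(x - x0) for x in xs)
        h = dists[k - 1] or 1e-12             # bandwidth: k-th nearest point
        w = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
        sw = sum(w)                           # >= 1, the point itself has weight 1
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted


def shuffle_blocks(positions, scores, block_size=1_000, seed=0):
    """Permute scores in position blocks of block_size bp (null distribution).

    Scores within a block stay together; only the block order is shuffled.
    """
    blocks = {}
    for pos, score in zip(positions, scores):
        blocks.setdefault(pos // block_size, []).append(score)
    keys = sorted(blocks)
    random.Random(seed).shuffle(keys)
    shuffled = []
    for key in keys:
        shuffled.extend(blocks[key])
    return shuffled


if __name__ == "__main__":
    rng = random.Random(1)
    positions = sorted(rng.randrange(0, 1_000_000) for _ in range(200))
    d_scores = [rng.gauss(0.0, 1.0) for _ in positions]   # simulated d scores
    observed = lowess(positions, d_scores)
    null = lowess(positions, shuffle_blocks(positions, d_scores))
    print(max(observed), max(null))
```

A regional pattern would appear as a smoothed observed curve that departs from the smoothed block-shuffled curve; where the two are indistinguishable, no large-scale clustering is supported.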



3.7. Polymorphic C m CGG methylation is not associated
with any particular genomic feature in the
reciprocal F1 triploids

It has been previously shown that ~20% of genes are
methylated in A. thaliana diploids. 34,35,37 We found
that ~20% of genes contained either constitutive or
polymorphic C m CGG methylation in the isogenic F1 triploids.
To test whether the constitutive and polymorphic
C m CGG sites differed in their associations with other
genomic features, we compared their distributions
across coding sequences (CDS), introns, 5' UTRs, 3'
UTRs, areas 1 kb upstream or downstream of genes, and
in intergenic regions (Fig. 3). The constitutive C m CGG
methylation was found in all of the genomic features
analysed, with some relative elevation in CDS and
introns, and possible reduction in 5' UTR and 3' UTR
(Fig. 3). The polymorphic C m CGG methylation was
found to a similar extent across all genomic features
(~6-7%).

Figure 2. (A) Percentage methylation of constitutive (orange) and polymorphic (blue) methylation in a C m CGG context across the five chromosomes of A. thaliana. (B) LOWESS smoothing of
constitutive methylation d scores using 200 kb discrete windows. LOWESS smoothing of constitutive methylation (orange), or a null distribution (black) obtained by shuffling by 1 kb
blocks then LOWESS smoothing. (C) LOWESS smoothing of polymorphic methylation d scores using 200 kb discrete windows. LOWESS smoothing of polymorphic methylation (blue),
or a null distribution (black) obtained by shuffling by 1 kb blocks then LOWESS smoothing. X-axis represents the position across the chromosomes.

Figure 3. Percentage of methylated sites (C m CGG) across genomic features for constitutive (orange) and polymorphic (blue) methylation.

Figure 4. Percentage methylation (C m CGG) across genic regions. (A) Constitutive methylation between reciprocal triploids and (B)
polymorphic methylation between reciprocal triploids. Gene regions split into percentiles from 5' to 3' end of genes. Genes were further
split into size ranges of <1, 1-2, 2-3 and >3 kb.



3.8. Constitutive (but not polymorphic) C m CGG gene 
body methylation in F1 triploids displays similar 
patterns to diploids 
While the role of gene body methylation has yet to be
fully elucidated in plants, roles in transcriptional
accuracy and splicing efficiency have been suggested, 34,40
and it appears that body-methylated
genes are functionally important and display slower
evolutionary rates in A. thaliana. 41 It has previously
been shown in A. thaliana diploids that genes containing
body methylation tend to be less methylated
towards their 5' and 3' ends. 34,35,37 To determine
whether this gene body methylation pattern was conserved
in reciprocal F1 triploids, the distribution of
both constitutive and polymorphic C m CGG sites
across the gene body (CDS and introns) was plotted
(UTRs were omitted as they are not annotated for all
genes). Genes were further divided by gene length
(i.e. <1, 1-2, 2-3 and >3 kb) (Fig. 4A and B). Genes
containing CCGG sites were divided into 10 percentiles,
and the proportion of C m CGG methylation was calculated
for each percentile.
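
The binning step described above can be sketched in a few lines. This is an illustrative reading of the procedure, assuming that each CCGG site is assigned to one of ten equal-width bins along the gene body, oriented from the 5' to the 3' end; the function names, the coordinate convention and the strand handling are assumptions rather than details given in the text.

```python
def body_bin(site_pos: int, gene_start: int, gene_end: int,
             strand: str = "+", n_bins: int = 10) -> int:
    """Bin index (0 = 5' end) of a CCGG site within a gene body."""
    length = gene_end - gene_start
    if length <= 0 or not gene_start <= site_pos <= gene_end:
        raise ValueError("site must lie within a gene of positive length")
    frac = (site_pos - gene_start) / length
    if strand == "-":
        frac = 1.0 - frac                 # orient minus-strand genes 5' -> 3'
    return min(int(frac * n_bins), n_bins - 1)


def methylated_fraction_per_bin(sites, n_bins: int = 10):
    """sites: iterable of (bin_index, is_methylated) pairs.

    Returns the proportion of methylated sites per bin, or None for a bin
    that contains no CCGG site.
    """
    totals = [0] * n_bins
    methylated = [0] * n_bins
    for bin_index, is_methylated in sites:
        totals[bin_index] += 1
        methylated[bin_index] += int(is_methylated)
    return [m / t if t else None for m, t in zip(methylated, totals)]


if __name__ == "__main__":
    # Hypothetical gene on the plus strand spanning positions 1000-2000.
    sites = [(body_bin(p, 1000, 2000), meth)
             for p, meth in [(1020, False), (1480, True), (1550, True), (1990, False)]]
    print(methylated_fraction_per_bin(sites))
```

Running the same tally separately for each gene-length class would give one profile per class, comparable in spirit to the panels of Fig. 4.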

The pattern of constitutive methylation observed in 
both reciprocal F1 triploids was similar to that observed
in diploids, 34,35,37 with the 5' and 3' ends of body- 
methylated genes comparatively depleted in C m CGG 
methylation. This pattern is particularly pronounced 
for longer genes (Fig. 4A), again confirming a pattern 
observed in diploids. 37 Furthermore, this pattern is 
observed when m C (i.e. methylation in all contexts) is 
plotted across gene bodies. 33 Given that gene body 
methylation in A. thaliana is primarily in a CG 
context, 42 this suggests that the CCGG sites tested in 
this study are broadly representative of gene body CG 
methylation in A. thaliana.

However, the distribution of polymorphic C m CGG 
methylation across the gene body was rather different 
and was much more uniformly distributed across the 
gene body (Fig. 4B). There was also very little difference 
in the distribution of polymorphic C m CGG methylated 
sites across genes of different lengths. In summary,
constitutive methylation of gene body regions is higher
overall than polymorphic methylation (Fig. 3),
particularly in the middle and at the 3' end of
the gene body. In contrast, more C m CGG
polymorphisms are found at the 5' end of genes when 
compared with constitutive methylation (Fig. 4A and 
B). We conclude that C m CGG methylation patterns are 
largely unchanged in isogenic reciprocal F1 triploids 
generated in either cross direction (when compared 
with diploids). 

A non-linear relationship between gene expression 
(absolute transcript levels) and gene body methylation 
has been observed in A. thaliana. 34,37 The least 
expressed genes and the most highly expressed genes 
are found to contain the lowest levels of gene body 
methylation, whereas genes expressed at intermediate 
levels contain the highest levels of gene body methyla- 
tion. 33,36 To test whether this pattern of gene body 
methylation is maintained in A. thaliana F1 triploids, 
we investigated the correlation between constitutive 
C m CGG methylation and absolute gene expression 
levels. All genes were divided into 20 percentile categories
according to their absolute expression levels
(see Materials and methods). Within each expression 
level percentile and for a range of gene annotation cat- 
egories (i.e. CDS, intron, 5' UTR, 3' UTR, up- and down- 
stream 500 bp regions), the number of genes 
containing constitutive C m CGG methylation site(s) 
was divided by those containing CCGG feature(s). For 
CDS and introns, the pattern of lowest and highest 
expressed genes having the lowest levels of constitutive 
C m CGG methylation was observed (Fig. 5). In contrast, 
C m CGG methylation within up- and downstream 
sequences, and in both UTRs, was generally low regardless
of transcript expression levels (Fig. 5). As this
pattern between gene expression and methylation is
also evident when m C methylation is measured, 34,35 it is a
further indication that the CCGG sites measured in
this study provide a good representation of the
broader m CG methylation distribution, although
there remains the possibility that other important CG
sites might be missed in the analysis of particular loci.

Figure 5. Association between constitutive methylation (C m CGG)
across genomic features (y-axis) and the absolute gene
expression of associated genes (x-axis). Gene expression levels
from all arrays [3x(m) and 3x(p)] were used and genes were split
into percentiles.
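
The expression-binning calculation used for this comparison can be sketched as below. It assumes that genes are ranked by absolute expression, split into 20 equal-sized rank categories, and that, within each category, the number of genes with at least one constitutive C m CGG site is divided by the number of genes with at least one CCGG site. The function names and the toy gene identifiers are hypothetical.

```python
def expression_bins(expr_by_gene: dict, n_bins: int = 20) -> dict:
    """Assign each gene to a rank-based expression category (0 = lowest)."""
    ranked = sorted(expr_by_gene, key=expr_by_gene.get)
    n = len(ranked)
    return {gene: min(i * n_bins // n, n_bins - 1)
            for i, gene in enumerate(ranked)}


def methylated_fraction_by_bin(bins: dict, has_ccgg: set,
                               has_methylated: set, n_bins: int = 20) -> list:
    """Per category: genes with a constitutive C m CGG site / genes with a CCGG site."""
    fractions = []
    for b in range(n_bins):
        with_ccgg = [g for g, gb in bins.items() if gb == b and g in has_ccgg]
        methylated = [g for g in with_ccgg if g in has_methylated]
        fractions.append(len(methylated) / len(with_ccgg) if with_ccgg else None)
    return fractions


if __name__ == "__main__":
    # Toy example with four hypothetical genes and two categories.
    expr = {"g1": 1.0, "g2": 2.0, "g3": 3.0, "g4": 4.0}
    bins = expression_bins(expr, n_bins=2)
    print(methylated_fraction_by_bin(bins, {"g1", "g2", "g3"}, {"g1", "g3"}, n_bins=2))
```

The calculation would be repeated per annotation category (CDS, intron, UTRs, flanking regions) by restricting the two gene sets to sites within that feature.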



3.9. Gene expression differences between reciprocal F1 
isogenic triploids are not correlated with 
differential C m CGG methylation 
To examine the relationship between C m CGG methylation
polymorphisms and gene expression, gene
expression d scores were linearly regressed against d
scores for polymorphic methylation. Probes were
mapped to genomic features: CDS, introns, 5' UTRs,
3' UTRs, and 1000 bp up- and downstream regions
(up- and downstream regions were tested in 100 bp
intervals). No significant correlation was observed
between gene expression levels and polymorphic
C m CGG methylation at CDS, introns or downstream
regions (at a threshold of P < 0.05). In contrast, a weak but significant
negative correlation (r=0.13, P= 0.0039) was
identified between gene expression and regions
900-1000 bp upstream of genes.

Next, as most genes did not display any level of 
polymorphic methylation, we considered only those 
features found to have significant polymorphic methy- 
lation (either hypo- or hypermethylation), and com- 
pared their gene expression d scores with their 
polymorphic methylation d scores. Genomic regions 
hypomethylated in the 3x(p) were considered
separately from those regions hypermethylated in the
3x(p). A moderate positive correlation between hypomethylated
regions 800-900 bp upstream of genes in the
3x(p) and gene expression was identified (r=0.31,
P= 0.0459). Overall, our results indicate that polymorphic
C m CGG methylation does not have a strong
effect on gene expression in the reciprocal F1 triploids,
and that the changes in gene regulation between paternal-
and maternal-excess triploids are not controlled by
C m CGG methylation. 

3.10. 24-nt small RNAs differentially accumulate
in up- and downstream regions
Several species of small non-coding RNAs play a role
in gene and transposable element regulation, either by
post-transcriptional gene silencing (miRNAs and tasiRNAs)
or by RdDM (24-nt RNAs). 43 In the case of
RdDM, the accumulation of 24-nt RNAs at a locus can
be considered an indication of asymmetric CHH methylation
at that locus (where H = C, A or T), as CHH methylation
is reliant on RdDM. Using existing small RNA sequencing
data from 3x(m) and 3x(p) A. thaliana leaf
tissue, 44 we investigated whether the genes differentially
expressed between 3x(m) and 3x(p) displayed
differential accumulation of either 21- or 24-nt RNAs
across their genomic features (CDS, introns, 3' UTRs,
5' UTRs, up- and downstream regions).

For the genes we identified as differentially expressed
between the reciprocal F1 triploids, fewer 24-nt small
RNAs accumulated in the upstream
regions of dysregulated genes in the 3x(p)
(median: 2.59, IQR: 1.29-61.10) compared with
3x(m) plants (median: 15.10, IQR: 1.89-91.52)
(Fig. 6A), although the interquartile ranges are comparable,
indicating that at least some upstream regions in
the 3x(p) accumulate similar numbers of 24-nt RNAs
to the upstream regions in the 3x(m)
(Fig. 6A). Similarly, downstream regions of dysregulated
genes were found to accumulate fewer 24-nt RNAs in the
3x(p) (median: 2.59, IQR: 1.29-78.23) compared
with 3x(m) (median: 16.98, IQR: 1.89-111.80). In
contrast, the other genic features tested displayed
much smaller variations in 24-nt RNA accumulation
(Fig. 6A). Furthermore, the accumulation of 21-nt
RNA remained consistent across 3x(m) and 3x(p) for
each feature tested (Fig. 6B).
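
The median and IQR summaries quoted above can be reproduced in form with a short sketch. It assumes a list of small RNA counts per genotype for one genomic feature; the quartile convention used by the authors is not stated, so the "inclusive" method below is an assumption, and the function name and toy counts are hypothetical.

```python
import statistics


def summarize_counts(counts):
    """Return the median and quartiles (Q1, Q3) of a list of small RNA counts.

    Quartiles use the 'inclusive' method of statistics.quantiles; this is an
    assumed convention, not one confirmed by the source text.
    """
    q1, _, q3 = statistics.quantiles(counts, n=4, method="inclusive")
    return {"median": statistics.median(counts), "q1": q1, "q3": q3}


# Made-up counts for one feature, purely illustrative (not data from the study).
toy_counts = {
    "3x(m)": [5, 1, 4, 2, 3],
    "3x(p)": [1, 1, 2, 9, 3],
}
summaries = {genotype: summarize_counts(c) for genotype, c in toy_counts.items()}
```

Reporting the quartiles alongside the median matters here because, as the text notes, two genotypes can differ in median while their interquartile ranges still overlap substantially.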

Figure 6. Boxplots of the distribution of small RNA accumulation across genomic features associated with genes differentially expressed
between 3x(m) and 3x(p). n = the number of differentially expressed genes represented in each boxplot. (A) Boxplots of 24-nt small
RNA accumulation. (B) Boxplots of 21-nt small RNA accumulation.

To test whether these patterns of small RNA accumulation
were specific to the differentially expressed genes
or whether they were consistent across all genes in the
genome, the distribution of accumulation of 21- and
24-nt small RNAs for non-differentially expressed
genes was determined (Supplementary Fig. S3A and B).
The accumulation of both 21- and 24-nt RNAs was
found to be comparable between 3x(m) and 3x(p) for
all features tested, which is consistent with the original
analysis of the small RNA data. 44 These results indicate
that the loss of 24-nt RNAs in leaves in the up- and
downstream regions is associated with differentially
expressed genes up-regulated in the paternal-excess
triploid. Indeed, differences in 24-nt RNA accumulation in
up- and downstream regions of genes have previously
been shown to contribute to divergence in gene expression
between two A. thaliana species. 45 Notably, 24-nt
RNAs are known to be involved in de novo methylation,
in particular asymmetric CHH methylation (via the
RdDM pathway). 46 Hence, loss of 24-nt RNAs leading
to loss of CHH methylation could be mechanistically
involved in the up-regulation of gene expression at these loci in
the paternal-excess F1 triploids. However, the small
RNA dataset used here was generated from leaf tissue,
which limits direct comparison with the seedling
transcriptome data. Further investigation of the distribution of small
RNAs in seedlings of reciprocal triploids will shed
further light on the relationship between gene expression
and small RNA distribution in triploids.



3.11. Conclusions 

Overall, this study demonstrates that the transcrip- 
tomes of reciprocal F1 triploids are non-equivalent, 
despite the genetically identical nature of maternal 
genome excess versus paternal genome excess F1 tri- 
ploids. This indicates that there are parent-of-origin- 
specific genome-dosage effects on the transcriptome of 
paternal-excess F1 triploids that could have an epigenetic
basis. Even though DNA methylation is one form of epigenetic
mark that has been widely associated with gene
expression changes, our findings indicate that the parental
genome-dosage-dependent effects on gene expression
in paternal-excess F1 triploids are not associated
with C m CGG methylation and may instead be linked
to RdDM pathways involving 24-nt small RNAs that are
involved in de novo CHH methylation. Overall, our
study suggests that the paternally and maternally inher- 
ited chromosome sets in autopolyploid plants may be 
epigenetically different, due to parental genome- 
dosage effects that can affect transcript levels in a 
C m CGG methylation-independent manner. Such epigen- 
etic differences between reciprocal F1 triploids that are 
genetically identical have implications for our under- 
standing of triploidy and polyploidy in plants (and 
animals 47 ), including for fundamental and applied gen- 
etics of triploid and other autopolyploid crops. 